BERT-base
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Dominican Republic (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- (5 more...)
Birder: Communication-Efficient 1-bit Adaptive Optimizer for Practical Distributed DNN Training
Therefore, from a system-level perspective, the design ethos of a system-efficient communication-compression algorithm is that compression and decompression must be computationally light and fast, and the algorithm must remain friendly to efficient collective communication primitives.
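Birder's actual compressor is not reproduced in this digest, but a minimal sketch of the kind of scheme that ethos points to is shown below: sign bits plus one per-tensor scale, packed into a flat byte buffer that standard collective primitives can move directly. All names here are illustrative, not Birder's API.

```python
import numpy as np

def compress_1bit(grad: np.ndarray):
    """Illustrative 1-bit compression: sign bits plus one per-tensor scale.

    Compression is just a sign test and a bit-pack, so it is computationally
    light, and the resulting flat uint8 buffer is friendly to collective
    primitives such as all-gather.
    """
    scale = np.mean(np.abs(grad))   # single scalar, cheap to compute
    signs = (grad >= 0)             # boolean sign mask
    packed = np.packbits(signs)     # 8 signs per byte -> ~32x smaller payload
    return packed, scale, grad.shape

def decompress_1bit(packed, scale, shape):
    """Inverse: unpack sign bits and rescale to +/- scale."""
    n = int(np.prod(shape))
    signs = np.unpackbits(packed)[:n].astype(np.float32)
    return (2.0 * signs - 1.0).reshape(shape) * scale

# Round trip on a toy gradient.
g = np.random.randn(1024).astype(np.float32)
packed, s, shp = compress_1bit(g)
g_hat = decompress_1bit(packed, s, shp)
```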
- North America > Canada (0.04)
- Asia > China (0.04)
BayesQ: Uncertainty-Guided Bayesian Quantization
Lamaakal, Ismail, Yahyati, Chaymae, Maleh, Yassine, Makkaoui, Khalid El, Ouahbi, Ibrahim
We present BayesQ, an uncertainty-guided post-training quantization framework that is the first to optimize quantization under the posterior expected loss. BayesQ fits a lightweight Gaussian posterior over weights (diagonal Laplace by default; optional K-FAC/low-rank), whitens by the posterior covariance, designs codebooks to minimize posterior-expected distortion, and allocates mixed precision via a greedy knapsack that maximizes marginal expected-loss reduction per bit under a global budget. For scalar quantizers, posterior-expected MSE yields closed-form tables; task-aware proxies are handled by short Monte Carlo on a small calibration set. An optional calibration-only distillation aligns the quantized model with the posterior predictive teacher. At matched average bits/weight of 3.0/3.5/4.0, BayesQ improves over strong PTQ baselines on ResNet-50 (ImageNet) and BERT-base (GLUE), e.g., vs. GPTQ by $+1.5/+0.7/+0.3$ top-1 percentage points on RN50 and $+1.1/+0.4/+0.2$ GLUE points on BERT, while requiring one-time preprocessing comparable to a GPTQ pass. BayesQ reframes low-bit quantization as uncertainty-aware risk minimization in a practical, post-training pipeline.
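The greedy knapsack allocator is easiest to see in code. Below is a hypothetical sketch under stated assumptions: each weight group has a table of posterior-expected losses at candidate bit-widths, and upgrades are applied in order of marginal expected-loss reduction per extra bit until the global average-bits budget is hit. The function and table names are invented for illustration; this is not BayesQ's implementation.

```python
import heapq

def greedy_bit_allocation(loss_tables, sizes, avg_bits_budget):
    """Greedy knapsack sketch: loss_tables[g][b] is the posterior-expected
    loss of weight group g at bit-width b; sizes[g] is its weight count.
    Every group starts at the lowest width, and the upgrade with the
    largest loss reduction per extra bit is applied repeatedly while the
    global average bits/weight stays within budget."""
    widths = sorted(next(iter(loss_tables.values())).keys())
    alloc = {g: widths[0] for g in loss_tables}
    total_weights = sum(sizes.values())
    used_bits = sum(sizes[g] * alloc[g] for g in alloc)

    def gain_per_bit(g, b_from, b_to):
        extra = sizes[g] * (b_to - b_from)
        return (loss_tables[g][b_from] - loss_tables[g][b_to]) / extra, extra

    heap = []
    for g in alloc:
        if len(widths) > 1:
            gpb, extra = gain_per_bit(g, widths[0], widths[1])
            heapq.heappush(heap, (-gpb, g, widths[1], extra))

    while heap:
        neg_gpb, g, b_to, extra = heapq.heappop(heap)
        if alloc[g] >= b_to:  # stale entry, group already upgraded
            continue
        if (used_bits + extra) / total_weights > avg_bits_budget:
            continue          # this upgrade would bust the global budget
        alloc[g], used_bits = b_to, used_bits + extra
        nxt = widths.index(b_to) + 1
        if nxt < len(widths):  # queue the group's next possible upgrade
            gpb, e = gain_per_bit(g, b_to, widths[nxt])
            heapq.heappush(heap, (-gpb, g, widths[nxt], e))
    return alloc

# Toy usage: two weight groups, candidate widths {2, 3, 4} bits,
# a 3.0 average-bits/weight budget.
tables = {"layer1": {2: 5.0, 3: 2.0, 4: 1.5},
          "layer2": {2: 1.0, 3: 0.9, 4: 0.85}}
print(greedy_bit_allocation(tables, {"layer1": 100, "layer2": 100}, 3.0))
```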
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Lightweight Baselines for Medical Abstract Classification: DistilBERT with Cross-Entropy as a Strong Default
Liu, Jiaqi, Wang, Tong, Liu, Su, Hu, Xin, Tong, Ran, Wang, Lanruo, Xu, Jiexi
This research evaluates lightweight medical abstract classification methods to establish their maximum performance under tight budget restrictions. On the public medical abstracts corpus, we fine-tune BERT-base and DistilBERT with three objectives: cross-entropy (CE), class-weighted CE, and focal loss, under identical tokenization, sequence length, optimizer, and schedule. DistilBERT with plain CE gives the strongest raw-argmax trade-off, while post-hoc operating-point selection (validation-calibrated, classwise thresholds) substantially improves deployed performance; under this tuned regime, focal loss benefits most. We report Accuracy, Macro-F1, and Weighted-F1, release evaluation artifacts, and include confusion analyses to clarify error structure. The practical takeaway is to start with a compact encoder and CE, then add lightweight calibration or thresholding when deployment requires higher macro balance.
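A minimal sketch of what validation-calibrated, classwise thresholding can look like is given below. The paper's exact selection procedure is not specified in the abstract, so this coordinate-wise sweep, which rescales class probabilities to maximize macro-F1, is only one plausible instantiation; all names are hypothetical.

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_classwise_thresholds(probs, labels, grid=np.linspace(0.5, 2.0, 16)):
    """Post-hoc operating-point selection sketch: one multiplicative
    threshold per class, swept coordinate-wise on validation data.
    Prediction is argmax over threshold-scaled probabilities; maximizing
    macro-F1 trades a little raw accuracy for class balance."""
    n_classes = probs.shape[1]
    weights = np.ones(n_classes)

    def macro_f1(w):
        preds = np.argmax(probs * w, axis=1)
        return f1_score(labels, preds, average="macro")

    best = macro_f1(weights)
    for c in range(n_classes):  # one pass of coordinate ascent
        for t in grid:
            trial = weights.copy()
            trial[c] = t
            score = macro_f1(trial)
            if score > best:
                best, weights = score, trial
    return weights, best

# Usage: probs/labels would come from the classifier's validation split;
# random stand-ins are used here only to make the sketch runnable.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(5), size=200)
labels = rng.integers(0, 5, size=200)
w, f1 = tune_classwise_thresholds(probs, labels)
```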
- North America > United States > California > Orange County > Irvine (0.14)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > Michigan (0.04)
- North America > United States > Florida > Hillsborough County > University (0.04)
- Health & Medicine > Therapeutic Area (0.68)
- Health & Medicine > Diagnostic Medicine (0.68)
FedHFT: Efficient Federated Finetuning with Heterogeneous Edge Clients
Ilhan, Fatih, Tekin, Selim Furkan, Huang, Tiansheng, Liu, Gaowen, Kompella, Ramana, Eisenhauer, Greg, Lin, Yingyan Celine, Pu, Calton, Liu, Ling
Fine-tuning pre-trained large language models (LLMs) has become common practice for personalized natural language understanding (NLU) applications on downstream tasks and domain-specific datasets. However, there are two main challenges: (i) limited and/or heterogeneous data for fine-tuning due to proprietary data confidentiality or privacy requirements, and (ii) varying computation resources available across participating clients such as edge devices. This paper presents FedHFT, an efficient and personalized federated fine-tuning framework that addresses both challenges. First, we introduce a mixture of masked adapters to handle resource heterogeneity across participating clients, enabling high-performance collaborative fine-tuning of pre-trained language model(s) across multiple clients in a distributed setting while keeping proprietary data local. Second, we introduce a bi-level optimization approach, based on masked personalization and client clustering, to handle non-IID data distributions. Extensive experiments demonstrate significant performance and efficiency improvements on various natural language understanding tasks under data and resource heterogeneity, compared to representative heterogeneous federated learning methods.
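FedHFT's mixture of masked adapters and its bi-level optimization are more involved than the abstract describes, but a minimal sketch of the core masking idea, under assumptions, appears below: each client trains only a budget-sized binary mask of a shared adapter, and the server averages each entry over the clients that cover it. All names are hypothetical, not the paper's API.

```python
import numpy as np

def make_client_mask(shape, capacity, seed):
    """Illustrative binary mask sized to a client's compute budget:
    capacity in (0, 1] is the fraction of adapter weights the client trains."""
    rng = np.random.default_rng(seed)
    return (rng.random(shape) < capacity).astype(np.float32)

def masked_update(global_adapter, client_grads, masks, lr=0.1):
    """Server-side aggregation sketch: each client contributes gradients only
    for its unmasked entries; every entry is averaged over the clients that
    cover it, so low-capacity clients never block shared-adapter updates."""
    num = sum(g * m for g, m in zip(client_grads, masks))
    den = np.maximum(sum(masks), 1.0)  # avoid divide-by-zero on uncovered entries
    return global_adapter - lr * (num / den)

# Toy round: three clients with heterogeneous capacities on a 4x4 adapter.
adapter = np.zeros((4, 4), dtype=np.float32)
caps = [0.25, 0.5, 1.0]
masks = [make_client_mask(adapter.shape, c, seed=i) for i, c in enumerate(caps)]
grads = [np.random.randn(4, 4).astype(np.float32) for _ in caps]
adapter = masked_update(adapter, grads, masks, lr=0.1)
```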
- North America > United States > Texas > Travis County > Austin (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario (0.04)
- (8 more...)
- Energy (0.46)
- Information Technology > Security & Privacy (0.34)
Saten: Sparse Augmented Tensor Networks for Post-Training Compression of Large Language Models
Solgi, Ryan, Zhen, Kai, Swaminathan, Rupak Vignesh, Susanj, Nathan, Mouchtaris, Athanasios, Kunzmann, Siegfried, Zhang, Zheng
The efficient implementation of large language models (LLMs) is crucial for deployment on resource-constrained devices. Low-rank tensor compression techniques, such as tensor-train (TT) networks, have been widely studied for over-parameterized neural networks. However, their application to compressing pre-trained LLMs for downstream tasks (post-training) remains challenging due to the high-rank nature of pre-trained LLMs and the lack of access to pre-training data. In this study, we investigate low-rank tensorized LLMs during fine-tuning and propose sparse augmented tensor networks (Saten) to enhance their performance. The proposed Saten framework enables full model compression. Experimental results demonstrate that Saten improves both accuracy and compression efficiency in tensorized language models, achieving state-of-the-art performance.
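As a rough illustration of the "sparse augmented" idea, the sketch below pairs a low-rank approximation with a sparse residual that keeps only the largest-magnitude approximation errors. Truncated SVD on a 2-D weight stands in for the tensor-train factorization, so this is an analogy to Saten rather than its implementation; all names are invented.

```python
import numpy as np

def sparse_augmented_lowrank(W, rank, sparsity=0.01):
    """Low-rank core plus sparse residual, in the spirit of Saten.

    Truncated SVD plays the role of the tensor network on this 2-D example;
    the sparse term keeps only the largest-magnitude residual entries, which
    is where high-rank pre-trained weights are approximated worst."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_lr = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]
    resid = W - W_lr
    k = max(1, int(sparsity * resid.size))  # budget of sparse residual entries
    thresh = np.partition(np.abs(resid).ravel(), -k)[-k]
    S = np.where(np.abs(resid) >= thresh, resid, 0.0)
    return W_lr, S

# Toy check: reconstruction error drops once the sparse term is added.
W = np.random.randn(256, 256)
W_lr, S = sparse_augmented_lowrank(W, rank=32, sparsity=0.02)
err_lr = np.linalg.norm(W - W_lr) / np.linalg.norm(W)
err_aug = np.linalg.norm(W - (W_lr + S)) / np.linalg.norm(W)
```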
- Africa > Senegal > Kolda Region > Kolda (0.04)
- North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > Canada (0.04)
- Asia > China (0.04)